Abstract: Let $C(d)$ be the capacity of the binary deletion
channel with deletion probability $d$. It was proved by Drinea and
Mitzenmacher that, for all $d$, $C(d)/(1-d) \geq 0.1185$. Fertonani
and Duman recently showed that $\limsup_{d\to 1} C(d)/(1-d) \leq 0.49$.
In this paper, it is proved that $\lim_{d\to 1} C(d)/(1-d)$ exists and
is equal to $\inf_{d} C(d)/(1-d)$. This result suggests the conjecture
that the curve $C(d)$ may be convex in the interval $d \in [0,1]$.
Furthermore, using currently known bounds for $C(d)$, it leads
to the upper bound $\lim_{d\to 1} C(d)/(1-d) \leq 0.4143$.

I. Introduction 

A binary deletion channel $W_d$ is defined as a binary
channel that drops bits of the input sequence independently
with probability $d$. Those bits that are not dropped simply
pass through the channel unaltered. While simple to describe,
the deletion channel proves to be very difficult to analyze.
Dobrushin ([1]) showed that for such a channel it is possible
to define a capacity $C(d)$ and that a Shannon-like theorem
applies to this channel. However, no closed-form expression
is known up to now for the capacity $C(d)$, and only upper and
lower bounds are currently available (see [2], [3], [4], [5], [6]).
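The channel definition above can be illustrated with a minimal simulation sketch. The function names `deletion_channel` and `is_subsequence` are hypothetical helpers introduced here for illustration; they are not taken from the paper or from the cited references.

```python
import random

def deletion_channel(bits, d, rng=None):
    """Drop each bit independently with probability d; survivors keep their order."""
    rng = rng or random.Random()
    # rng.random() is in [0, 1): a bit survives when the draw is >= d,
    # so d = 0 keeps everything and d = 1 deletes everything.
    return [b for b in bits if rng.random() >= d]

def is_subsequence(y, x):
    """True if y can be obtained from x by deleting symbols."""
    it = iter(x)
    return all(b in it for b in y)  # `in` consumes the iterator, preserving order

rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(20)]
y = deletion_channel(x, 0.3, rng)
print(len(x), len(y), is_subsequence(y, x))
```

The receiver sees only the shortened string `y`, with no indication of which positions were dropped. This is what distinguishes the channel from an erasure channel.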

For small values of $d$, it was recently independently proved
in [4] and [5] that $C(d) \approx 1 - H(d)$, where $H(d)$ is the binary
entropy function. For values of $d$ close to 1, it is known (see
[7], [6]) that $C(d)$ satisfies

$$0.1185 \leq \liminf_{d\to 1} \frac{C(d)}{1-d} \leq \limsup_{d\to 1} \frac{C(d)}{1-d} \leq 0.49. \tag{1}$$

As far as the author knows, there is no result in the literature
on the existence of $\lim_{d\to 1} C(d)/(1-d)$. In this paper, it is
proved that the limit exists and, in particular, that

$$\lim_{d\to 1} \frac{C(d)}{1-d} = \inf_{d} \frac{C(d)}{1-d}. \tag{2}$$



The best currently known upper bound for $C(d)$, when used
in the right hand side of (2), leads to the upper bound

$$\lim_{d\to 1} \frac{C(d)}{1-d} \leq 0.4143, \tag{3}$$

which improves the best previously known bound of equation
(1). Furthermore, equation (2) suggests the conjecture that
$C(d)$ may be a convex function of $d$. Indeed, as discussed
in Section IV below, experimental evidence (see Figure 1)
suggests the convexity of $C(d)$ for values of $d$ sufficiently
smaller than 1, while it is not easy to exclude that the
function may be concave near $d = 1$. Equation (2) is only
a necessary condition¹ for the convexity of $C(d)$ near $d = 1$.
It is, however, sufficient to conclude that $C(d)$ is not strictly
concave in any neighborhood of $d = 1$. Thus, either $C(d)$
exhibits a pathological behavior near $d = 1$, or it is convex in
a sufficiently small neighborhood of $d = 1$. A proof of the
convexity of $C(d)$ would of course imply equation (2) and
thus equation (3).

The main idea used in this paper is the intuitive fact that, for
a large enough number of input bits $n$, the deletion channel
$W_d$ is fairly well approximated by a channel which drops
exactly $[dn]$ bits selected uniformly at random. In particular,
we show that a channel $W_{n,k}$ with $n$-bit input and $k$-bit
output, selected uniformly within the $k$-bit subsequences of
the input, has a capacity that is close to $C(1 - k/n)$ for large
enough $n$. Using this result, we build upon the work in [6] to
prove (2).

II. Definition and regularity of C(d) 

For any $i$ and $j$, let $X_i^j = (X_i, X_{i+1}, \ldots, X_j)$ and, similarly,
$Y_i^j = (Y_i, Y_{i+1}, \ldots, Y_j)$. Let $W_d^n$ be a channel with an $n$-bit
string input whose output is obtained by dropping the bits of
the input independently with probability $d$. Let then

$$C_n(d) = \frac{1}{n} \max_{p_{X_1^n}} I(X_1^n; W_d^n(X_1^n)). \tag{4}$$

It was proved by Dobrushin [1] that a transmission capacity
$C(d)$ can be consistently defined for the deletion channel $W_d$
and that

$$C(d) = \lim_{n\to\infty} C_n(d). \tag{5}$$



Figure 1 shows the graph of the $C_n(d)$ functions for $n =
1, \ldots, 17$. The main objective of this section is to study the
convergence of the $C_n(d)$ functions to deduce a regularity
result for $C(d)$.
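For very small $n$, the quantity $C_n(d)$ in (4) can be evaluated numerically. The sketch below is an illustration only and is not the method used in [6] to produce Figure 1. It assumes mutual information is measured in bits, builds the transition matrix of $W_d^n$ by counting subsequence embeddings, and runs the standard Blahut-Arimoto iteration. The names `embeddings`, `transition_matrix`, `divergences` and `C_n` are hypothetical.

```python
import math
from itertools import product

def embeddings(x, y):
    """Number of ways y occurs as a subsequence of x (dynamic programming)."""
    ways = [1] + [0] * len(y)  # ways[j]: embeddings of y[:j] in the part of x read so far
    for a in x:
        for j in range(len(y), 0, -1):
            if y[j - 1] == a:
                ways[j] += ways[j - 1]
    return ways[-1]

def transition_matrix(n, d):
    """P[x][y] = embeddings(x, y) * d^(n-|y|) * (1-d)^|y| for all outputs of length 0..n."""
    xs = list(product((0, 1), repeat=n))
    ys = [y for k in range(n + 1) for y in product((0, 1), repeat=k)]
    return [[embeddings(x, y) * d ** (n - len(y)) * (1 - d) ** len(y)
             for y in ys] for x in xs]

def divergences(P, p):
    """D[x] = KL divergence (bits) between P(.|x) and the output law induced by p."""
    q = [sum(p[x] * P[x][y] for x in range(len(P))) for y in range(len(P[0]))]
    return [sum(P[x][y] * math.log2(P[x][y] / q[y])
                for y in range(len(q)) if P[x][y] > 0) for x in range(len(P))]

def C_n(n, d, iters=500):
    """Blahut-Arimoto estimate of (1/n) max I(X; W_d^n(X)), in bits per input bit."""
    P = transition_matrix(n, d)
    p = [1.0 / len(P)] * len(P)  # start from the uniform input distribution
    for _ in range(iters):
        w = [px * 2.0 ** D for px, D in zip(p, divergences(P, p))]
        total = sum(w)
        p = [v / total for v in w]
    # Mutual information at the final p (a lower estimate of the maximum).
    return sum(px * D for px, D in zip(p, divergences(P, p))) / n

for n in (1, 2, 3):
    print(n, C_n(n, 0.5))
```

The matrix has $2^n$ rows and $2^{n+1}-1$ columns, so this direct approach is only practical for small $n$.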

The following lemma gives a quantitative bound on the rate
of convergence in (5).

Lemma 1: (see also [1], [4], [6]) For every $d \in [0, 1]$ and
$n \geq 1$

$$C_n(d) - \frac{\log(n+1)}{n} \leq C(d) \leq C_n(d). \tag{6}$$



¹ It is not difficult to construct examples of "pathological" functions $f(d)$
that satisfy equation (2), when used in place of $C(d)$, but are not convex in
any neighborhood of $d = 1$.

Fig. 1. Plot of the $C_n(d)$ functions for $n = 1, \ldots, 17$ obtained by numerical
evaluations in [6].



Proof: As observed in [4], $nC_n(d)$ is a subadditive
function of $n$. In fact, for an input $X_1^{n+m}$, let $Y_{(0)} = W_d^n(X_1^n)$
and $Y_{(1)} = W_d^m(X_{n+1}^{n+m})$. Note that $Y = W_d^{n+m}(X_1^{n+m})$ can
be obtained as a concatenation of the strings $Y_{(0)}$ and $Y_{(1)}$.
Thus, $X_1^{n+m} \to (Y_{(0)}, Y_{(1)}) \to Y$ is a Markov chain. Hence,

$$
\begin{aligned}
(n+m)\,C_{n+m}(d) &= \max_{p_{X_1^{n+m}}} I(X_1^{n+m}; Y) \\
&\leq \max_{p_{X_1^{n+m}}} I(X_1^{n+m}; (Y_{(0)}, Y_{(1)})) \\
&\leq nC_n(d) + mC_m(d).
\end{aligned}
$$

This implies by Fekete's lemma (see [8, Prob. 98]) that the
limit $C(d) = \lim_{n\to\infty} C_n(d)$ exists and satisfies $C(d) =
\inf_{n \geq 1} C_n(d)$. This proves the right hand side inequality.

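Take now an integer $h > 1$ and consider, for an input
$X_1^{hn}$, the output $Y = W_d^{hn}(X_1^{hn})$ as the concatenation of
the $h$ outputs $Y_{(i)} = W_d^n(X_{in+1}^{(i+1)n})$, $i = 0, \ldots, h-1$. Let for
convenience $Y_{(0)}^{(h-1)} = (Y_{(0)}, Y_{(1)}, \ldots, Y_{(h-1)})$. It is clear that
$X_1^{hn} \to Y_{(0)}^{(h-1)} \to Y$ is a Markov chain. Let $L_i$ be the length
of $Y_{(i)}$. We thus have

$$
\begin{aligned}
hn\,C_{hn}(d) &= \max_{p_{X_1^{hn}}} I(X_1^{hn}; Y) \\
&= \max_{p_{X_1^{hn}}} \left[ I(X_1^{hn}; Y_{(0)}^{(h-1)}) - I(X_1^{hn}; Y_{(0)}^{(h-1)} \mid Y) \right] \\
&\geq \max_{p_{X_1^{hn}}} \left[ I(X_1^{hn}; Y_{(0)}^{(h-1)}) - H(Y_{(0)}^{(h-1)} \mid Y) \right] \\
&= \max_{p_{X_1^{hn}}} \left[ I(X_1^{hn}; Y_{(0)}^{(h-1)}) - H(L_0, \ldots, L_{h-2} \mid Y) \right] \\
&\geq \max_{p_{X_1^{hn}}} I(X_1^{hn}; Y_{(0)}^{(h-1)}) - (h-1)\log(n+1) \\
&= hn\,C_n(d) - (h-1)\log(n+1).
\end{aligned}
$$

Here the fourth line uses the fact that, given $Y$, the blocks $Y_{(0)}^{(h-1)}$
are determined by the lengths of $h-1$ of them, and the fifth line uses
the fact that each length takes at most $n+1$ values. Hence

$$C(d) = \lim_{h\to\infty} C_{hn}(d) \geq \lim_{h\to\infty} \left[ C_n(d) - \frac{h-1}{h} \frac{\log(n+1)}{n} \right] = C_n(d) - \frac{\log(n+1)}{n}.$$

See [6, eq. (39)] for a tighter, though more complicated, bound.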
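To get a feeling for how slowly the interval in (6) shrinks, the gap $\log(n+1)/n$ can be tabulated. The sketch below assumes base-2 logarithms (capacity measured in bits), which the text does not state explicitly; `lemma1_gap` is a hypothetical helper name.

```python
import math

def lemma1_gap(n):
    """Width log2(n+1)/n of the interval [C_n(d) - gap, C_n(d)] that contains C(d)."""
    return math.log2(n + 1) / n

for n in (1, 17, 100, 10_000):
    print(n, round(lemma1_gap(n), 4))
```

Under this assumption the gap for $n = 17$ is about 0.245, which exceeds the value $C_{17}(0.65) = 0.145$ quoted later in the paper, so the left inequality of (6) is uninformative at that point. The lemma is used here for its uniform convergence, not for numerical lower bounds.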

As a consequence of Lemma 1 we have the following
regularity result for $C(d)$.

Lemma 2: The function $C(d)$ is uniformly continuous in
$[0,1]$. Thus, for every $\beta > 0$ there is an $\alpha = \alpha(\beta)$ such that
$|d_1 - d_2| < \alpha \Rightarrow |C(d_1) - C(d_2)| < \beta$.

Proof: As shown in Lemma 1, the functions $C_n(d)$ tend
to $C(d)$ uniformly in $d$. Hence, if the $C_n(d)$ are proved to be
continuous in $d$, so is their limit $C(d)$. Since the domain of
$C(d)$ is compact, by the Heine-Cantor theorem $C(d)$ is also
uniformly continuous. That the $C_n(d)$ functions are continuous
is intuitively clear; the shortest formal proof that we were
able to provide goes as follows. The entries of the transition
matrix of the channel are polynomials in $d$ and thus the
mutual information $I(X_1^n; W_d^n(X_1^n))$ is a continuous function
of $d$ and of the input distribution $p_{X_1^n}$. Hence, by moving $d$
continuously from 0 to 1 one expects the capacity to change
continuously from 1 to 0. A formal proof, however, seems
to require using the compactness of the set of distributions
$p_{X_1^n}$. Assume that $C_n(d)$ is not continuous at $d = \bar{d}$ and let
$\bar{p}$ be the input distribution that attains the value $C_n(\bar{d})$. Then
there exists an $\varepsilon > 0$ such that $|C_n(\bar{d}) - C_n(d_k)| > \varepsilon$ for
a sequence $d_k$ converging to $\bar{d}$. Consider the distributions $p_k$
that attain $C_n(d_k)$. Since the set of the $p_{X_1^n}$ is bounded and
closed, there exists a subsequence of the $p_k$ that converges to
a distribution $p'$. By continuity of the mutual information, the
$C_n(d_k)$ values tend along this subsequence to the mutual information $I'$ attained by
$p'$ in $d = \bar{d}$. But, by definition of $C_n(\bar{d})$, we clearly have that
$I' \leq C_n(\bar{d})$ and thus $C_n(d_k) < C_n(\bar{d}) - \varepsilon$ for $k$ large enough.
But then the mutual information attained by $\bar{p}$ in $d_k$ tends to
$C_n(\bar{d}) > C_n(d_k) + \varepsilon$ for large enough $k$, which is absurd by
definition of $C_n(d_k)$. ■

III. Exact deletion channel 

Let now $W_{n,k}$, $k \leq n$, be a channel with $n$-bit input whose
output is uniformly chosen within the $\binom{n}{k}$ $k$-bit subsequences
of the input. This channel was efficiently used as an auxiliary
channel in [5], [6]. Let then

$$C_{n,k} = \frac{1}{n} \max_{p_{X_1^n}} I(X_1^n; W_{n,k}(X_1^n)). \tag{7}$$
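The exact deletion channel can be made concrete by enumerating the $\binom{n}{k}$ position subsets. The following sketch computes the output distribution of $W_{n,k}$ for one fixed input, using exact rational arithmetic; `exact_deletion_distribution` is a hypothetical name chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

def exact_deletion_distribution(x, k):
    """Output law of W_{n,k} for the bit string x: a k-subset of positions is
    chosen uniformly among the C(n, k) possibilities and those bits are kept in order."""
    n = len(x)
    counts = Counter("".join(x[i] for i in keep)
                     for keep in combinations(range(n), k))
    return {y: Fraction(c, comb(n, k)) for y, c in counts.items()}

print(exact_deletion_distribution("011", 2))
```

For the input `011` and $k = 2$, the three subsets give `01`, `01` and `11`, so the output is `01` with probability 2/3 and `11` with probability 1/3. Distinct subsets can produce the same string, which is why the outputs are not uniform over strings.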

The following obvious result will be used later.

Lemma 3: For every random $X_1^n$, if $k_1 > k_2$ then

$$I(X_1^n; W_{n,k_1}(X_1^n)) \geq I(X_1^n; W_{n,k_2}(X_1^n)). \tag{8}$$

Proof: Simply note that the $W_{n,k_2}$ channel can be
obtained as a cascade of $W_{n,k_1}$ and $W_{k_1,k_2}$. Thus, $X_1^n \to
W_{n,k_1}(X_1^n) \to W_{n,k_2}(X_1^n)$ is a Markov chain and the lemma
follows from the data processing inequality. ■

The following lemma bounds the capacity of the $W_d^n$
channel in terms of the capacity of certain exact deletion
channels.

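Lemma 4: For every $\varepsilon > 0$, $d \in [\varepsilon, 1 - \varepsilon]$, and $n \geq 1$

$$C_{n,\lceil (1-d-\varepsilon)n \rceil} - 2e^{-2\varepsilon^2 n} \leq C_n(d) \leq C_{n,\lfloor (1-d+\varepsilon)n \rfloor} + 2e^{-2\varepsilon^2 n}. \tag{9}$$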



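Proof: We first prove the right hand side inequality.
For an input $X_1^n$, let $Y = W_d^n(X_1^n)$ and let $L = |Y|$
be the length of $Y$. First note that $X_1^n \to Y \to L$ is a
Markov chain. So, by applying the chain rule to $I(X_1^n; Y, L)$,
and considering that $I(X_1^n; L) = 0$ since $L$ is independent of
$X_1^n$, it is easily seen that $I(X_1^n; Y) = I(X_1^n; Y \mid L)$. Define
$T = \{j : |j/n - (1-d)| \leq \varepsilon\}$, that is, $j \in T$ if and only
if $\lceil (1-d-\varepsilon)n \rceil \leq j \leq \lfloor (1-d+\varepsilon)n \rfloor$. Let now $X_1^n$ be
distributed according to the optimal distribution for the $W_d^n$
channel. Then we have

$$
\begin{aligned}
nC_n(d) &= I(X_1^n; Y \mid L) \\
&= \sum_{j=0}^{n} p_L(j)\, I(X_1^n; Y \mid L = j) \\
&= \sum_{j \in T} p_L(j)\, I(X_1^n; Y \mid L = j) + \sum_{j \notin T} p_L(j)\, I(X_1^n; Y \mid L = j) \\
&\overset{(a)}{\leq} \sum_{j \in T} p_L(j)\, I(X_1^n; Y \mid L = \lfloor (1-d+\varepsilon)n \rfloor) + \sum_{j \notin T} p_L(j)\, n \\
&\leq nC_{n,\lfloor (1-d+\varepsilon)n \rfloor} \sum_{j \in T} p_L(j) + n \sum_{j \notin T} p_L(j) \\
&\overset{(b)}{\leq} nC_{n,\lfloor (1-d+\varepsilon)n \rfloor} + 2n e^{-2\varepsilon^2 n},
\end{aligned}
$$

where (a) follows from Lemma 3 and the definition of $T$ and
(b) follows from the Chernoff bound. Dividing by $n$ we get
the desired inequality.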

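As for the left hand side inequality, let now $X_1^n$ be
distributed according to the optimal distribution for the
$W_{n,\lceil (1-d-\varepsilon)n \rceil}$ channel. Then we have

$$
\begin{aligned}
nC_n(d) &\geq I(X_1^n; Y \mid L) \\
&= \sum_{j=0}^{n} p_L(j)\, I(X_1^n; Y \mid L = j) \\
&= \sum_{j \in T} p_L(j)\, I(X_1^n; Y \mid L = j) + \sum_{j \notin T} p_L(j)\, I(X_1^n; Y \mid L = j) \\
&\overset{(a)}{\geq} \sum_{j \in T} p_L(j)\, I(X_1^n; Y \mid L = \lceil (1-d-\varepsilon)n \rceil) \\
&\overset{(b)}{\geq} nC_{n,\lceil (1-d-\varepsilon)n \rceil} \left(1 - 2e^{-2\varepsilon^2 n}\right) \\
&\overset{(c)}{\geq} nC_{n,\lceil (1-d-\varepsilon)n \rceil} - 2n e^{-2\varepsilon^2 n},
\end{aligned}
$$

where (a) follows again from Lemma 3, (b) follows from
the Chernoff bound, and (c) follows from the obvious fact
that $C_{n,\lceil (1-d-\varepsilon)n \rceil} \leq 1$. Dividing by $n$ the desired result is
obtained. ■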
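The concentration step used in both halves of the proof can be checked numerically. The output length $L$ is binomial with parameters $n$ and $1-d$, and the bound $2e^{-2\varepsilon^2 n}$ on the probability of falling outside $T$ is the form usually attributed to Hoeffding. The sketch below compares the exact tail with the bound; `tail_outside_T` and `chernoff_bound` are hypothetical helper names.

```python
from math import comb, exp

def tail_outside_T(n, d, eps):
    """Exact P(|L/n - (1-d)| > eps) for L ~ Binomial(n, 1-d)."""
    q = 1 - d  # probability that a single bit survives
    return sum(comb(n, j) * q ** j * d ** (n - j)
               for j in range(n + 1) if abs(j / n - q) > eps)

def chernoff_bound(n, eps):
    """The bound 2 * exp(-2 * eps^2 * n) used in steps (b) of the proof."""
    return 2 * exp(-2 * eps ** 2 * n)

for n, d, eps in [(50, 0.3, 0.1), (200, 0.65, 0.05), (500, 0.9, 0.02)]:
    print(n, d, eps, tail_outside_T(n, d, eps), chernoff_bound(n, eps))
```

The bound does not depend on $d$, which is what makes the error term in (9) uniform over $d \in [\varepsilon, 1-\varepsilon]$.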
The following lemma bounds the capacity of the exact
deletion channel $W_{n,k}$ in terms of $C(d)$ for appropriate values
of $d$.

Lemma 5: For every $\varepsilon > 0$ and integers $n$ and $k$

$$C(1 - k/n + \varepsilon) - 2e^{-2\varepsilon^2 n} \leq C_{n,k} \leq C(1 - k/n - \varepsilon) + 2e^{-2\varepsilon^2 n} + \frac{\log(n+1)}{n}. \tag{10}$$

Proof: Take $d = 1 - k/n - \varepsilon$ in Lemma 4 to obtain
$C_{n,k} \leq C_n(1 - k/n - \varepsilon) + 2e^{-2\varepsilon^2 n} \leq C(1 - k/n - \varepsilon) +
2e^{-2\varepsilon^2 n} + \log(n+1)/n$, by virtue of Lemma 1. Then take
$d = 1 - k/n + \varepsilon$ in Lemma 4 to obtain $C_{n,k} \geq C_n(1 - k/n +
\varepsilon) - 2e^{-2\varepsilon^2 n} \geq C(1 - k/n + \varepsilon) - 2e^{-2\varepsilon^2 n}$. ■
Lemma 6: For every $\beta > 0$, there is an $\bar{n} = \bar{n}(\beta)$ such that

$$|C_{n,k} - C(1 - k/n)| < \beta \quad \forall\, n \geq \bar{n},\ k = 1, \ldots, n. \tag{11}$$

Proof: First note that, for $\varepsilon > 0$, $C(1 - k/n + \varepsilon) \leq
C(1 - k/n) \leq C(1 - k/n - \varepsilon)$. Hence, $C(1 - k/n)$ satisfies
the two inequalities satisfied by $C_{n,k}$ in equation (10). So,
$|C_{n,k} - C(1 - k/n)|$ is bounded by the difference between the
right hand side and the left hand side of equation (10), that is

$$|C_{n,k} - C(1 - k/n)| \leq C(1 - k/n - \varepsilon) - C(1 - k/n + \varepsilon) + 4e^{-2\varepsilon^2 n} + \frac{\log(n+1)}{n}. \tag{12}$$

With the notation of Lemma 2, take $\varepsilon < \alpha(\beta/2)/2$ so that
$C(1 - k/n - \varepsilon) - C(1 - k/n + \varepsilon) < \beta/2$. Once $\varepsilon$ is fixed,
choose $\bar{n}$ such that $4e^{-2\varepsilon^2 n} + \log(n+1)/n < \beta/2$ for all $n \geq \bar{n}$ to complete the
proof. Note that $\bar{n}$ is a function of $\beta$ only and that the result
holds for every $k \leq n$. ■

We can now state the first result of this paper.

Theorem 1: Let $k_n$ be an integer valued sequence such that
$k_n/n$ tends to $1 - d$ as $n$ goes to infinity. Then

$$\lim_{n\to\infty} C_{n,k_n} = C(d). \tag{13}$$

Proof: It follows easily from Lemma 6 by continuity of
$C(d)$.



IV. Behavior near d = 1 



In this Section, we finally focus on the behavior of the
function $C(d)$ for values of $d$ close to 1. It is interesting
to observe in Figure 1 that, from experimental evidence,
the $C_n(d)$ functions seem to be convex in a progressively
expanding region of $d$ values. On the one hand, it is tempting
to conjecture that the limit $C(d)$ is convex in the whole interval
$d \in [0, 1]$. On the other hand, near $d = 1$, all the $C_n(d)$ curves
appear to change concavity and go to zero asymptotically as
$(1 - d)$. Indeed, we have the following result.

Lemma 7: For every $n$,

$$\lim_{d\to 1} \frac{C_n(d)}{1-d} = 1. \tag{14}$$

Proof: It is easily shown that for every $n$ and $d$

$$(1 - d^n)/n \leq C_n(d) \leq (1 - d). \tag{15}$$

The right hand side inequality follows from the fact that the
capacity of $W_d^n$ is obviously no larger than the capacity of $n$ uses of a
binary erasure channel with erasure probability $d$. To prove
the left hand side inequality, consider using as input to the
channel $W_d^n$ only the sequence composed of $n$ zeros and that
composed of $n$ ones. Then the $n$ uses of $W_d$ correspond to
one use of an erasure channel with erasure probability $d^n$. This
proves equation (15). Dividing by $(1 - d)$ and taking the limit
$d \to 1$ gives the required result. ■
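The final limit step of this proof can be written out explicitly. Dividing (15) by $1 - d$ and using the factorization $1 - d^n = (1-d)(1 + d + \cdots + d^{n-1})$ gives the following worked version of the squeeze argument:

```latex
\frac{1 + d + \cdots + d^{n-1}}{n}
  = \frac{1 - d^n}{n(1-d)}
  \leq \frac{C_n(d)}{1-d}
  \leq 1,
\qquad
\lim_{d \to 1} \frac{1 + d + \cdots + d^{n-1}}{n} = \frac{n}{n} = 1 .
```

Both outer terms tend to 1 as $d \to 1$ for fixed $n$, which yields (14). The lower bound approaches 1 more and more slowly as $n$ grows, which is consistent with the different behavior of the limit function $C(d)$ discussed next.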
Lemma 7 ensures that, for fixed $n$, $C_n(d)$ is not convex in
a neighborhood of $d = 1$. Note further that

$$\lim_{d\to 1} \frac{C_n(d)}{1-d} = \sup_{d \in (0,1)} \frac{C_n(d)}{1-d} = 1. \tag{16}$$

Hence, it is natural to believe that $C_n(d)$ is actually concave
in a neighborhood of $d = 1$, even if Lemma 7 is not sufficient
to prove this. However, in the limit $n \to \infty$, it is known (see
[7], [6]) that $C(d)$ satisfies

$$0.1185 \leq \liminf_{d\to 1} \frac{C(d)}{1-d} \leq \limsup_{d\to 1} \frac{C(d)}{1-d} \leq 0.49. \tag{17}$$

Hence, Lemma 7 does not hold with $C(d)$ in place of $C_n(d)$
and it is still legitimate to conjecture that $C(d)$ may be convex
in $[0, 1]$. The next step is thus to ask if $C(d)/(1 - d)$ has a
limit as $d \to 1$ and, if so, if this limit is reached from above as
would be implied by convexity of $C(d)$. The remaining part
of this section tries to answer this question.

In order to understand the behavior of $C(d)$ near $d = 1$, the
following result from [6] is fundamental.

Lemma 8 (Fertonani and Duman, [6, eq. (32)]): For every
$n$, $k$

$$\limsup_{d\to 1} \frac{C(d)}{1-d} \leq \frac{nC_{n,k} + 1}{k+1}. \tag{18}$$

Remark 1: In [6] the authors state that, for every $n$ and
$k$, $\lim_{d\to 1} \frac{C(d)}{1-d} \leq \frac{nC_{n,k}+1}{k+1}$. However, we are not aware of
a previous formal proof that $\lim_{d\to 1} \frac{C(d)}{1-d}$ exists. This fact is
proved in the following theorem.

Theorem 2: It holds that

$$\lim_{d\to 1} \frac{C(d)}{1-d} = \inf_{d \in (0,1)} \frac{C(d)}{1-d}. \tag{19}$$

Proof: For every $d' \in (0, 1)$, let $k_n$ be a sequence
such that $k_n/n$ tends to $1 - d'$. Then, from Theorem 1,
the right hand side of (18), with $k_n$ in place of $k$, tends to
$C(d')/(1 - d')$. Since $d'$ is arbitrary, Lemma 8 implies that
$\limsup_{d\to 1} C(d)/(1 - d) \leq \inf_{d' \in (0,1)} C(d')/(1 - d')$. However, it is
obvious that $\liminf_{d\to 1} C(d)/(1 - d) \geq \inf_{d' \in (0,1)} C(d')/(1 - d')$. Thus
$\lim_{d\to 1} C(d)/(1 - d)$ exists and is equal to $\inf_{d' \in (0,1)} C(d')/(1 - d')$.
■

A direct consequence of Theorem 2 is the following improved
bound on $C(d)$.

Corollary 1:

$$\lim_{d\to 1} \frac{C(d)}{1-d} \leq 0.4143. \tag{20}$$

Proof: As far as the author knows, the best known
numerical bound obtained for $\inf_d C(d)/(1 - d)$ is 0.4143,
obtained using the bound $C(0.65) \leq C_{17}(0.65) = 0.145$,
numerically evaluated in [6]. ■
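The arithmetic behind the constant can be spelled out by chaining Theorem 2 with the right hand side of (6) at the single point $d' = 0.65$, using the value $C_{17}(0.65) = 0.145$ quoted above:

```latex
\lim_{d \to 1} \frac{C(d)}{1-d}
  = \inf_{d' \in (0,1)} \frac{C(d')}{1-d'}
  \leq \frac{C(0.65)}{1 - 0.65}
  \leq \frac{C_{17}(0.65)}{0.35}
  = \frac{0.145}{0.35}
  \approx 0.41429
  \leq 0.4143 .
```

Any other pair $(d', \text{upper bound on } C(d'))$ can be substituted in the same way, and the smallest resulting ratio gives the best bound on the limit.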
The usefulness of Theorem 2 is that it allows one to deduce
provable bounds for $\lim_{d\to 1} C(d)/(1-d)$ from bounds on $C(d)$ even
with $d$ much smaller than 1. It is interesting to note, in fact,
that different techniques seem to be effective in bounding $C(d)$
in different regions of the interval $[0, 1]$. For example, different
genie aided channels are used in [6] for smaller values of $d$
than for large values of $d$ and, while equation (18) is derived in
[6] using a bound effective for large $d$, the bound for $C(0.65)$
used in Corollary 1 is derived from the numerical value of
$C_{17}(d)$, which is not as effective for $d$ larger than 0.8 (see Table
IV in [6], where bound C4 therein is what we called $C_{17}(d)$,
while a different bound therein is used to deduce (18)). Thus, in order to
obtain improved upper bounds for $\lim_{d\to 1} C(d)/(1-d)$, one effective
approach would be to numerically evaluate $C_n(d)$ near $d =
0.65$ for $n \geq 18$. This requires, however, high computational
and spatial complexity and is beyond the scope of the present
paper.